1,173 research outputs found

    Generalized Shortest Path Kernel on Graphs

    We consider the problem of classifying graphs using graph kernels. We define a new graph kernel, called the generalized shortest path kernel, based on the number and length of shortest paths between nodes. As an example classification problem, we consider the task of classifying random graphs from two well-known families by the number of clusters they contain. We verify empirically that the generalized shortest path kernel outperforms the original shortest path kernel on a number of datasets. We give a theoretical analysis to explain our experimental results. In particular, we estimate distributions of the expected feature vectors for the shortest path kernel and the generalized shortest path kernel, and we show some evidence explaining why our graph kernel outperforms the shortest path kernel for our graph classification problem. Comment: Short version presented at Discovery Science 2015 in Banff.
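
    The contrast between the two kernels can be illustrated with a minimal sketch. It assumes unweighted, undirected graphs given as adjacency dictionaries with comparable node labels, and it assumes one plausible reading of the abstract: the shortest path kernel counts node pairs by shortest-path length, while the generalized kernel counts them by (length, number of shortest paths). The function names are hypothetical and this is not the authors' implementation.

```python
from collections import Counter, deque


def shortest_path_stats(adj, source):
    """BFS on an unweighted graph: {node: (distance, number of shortest paths)}."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:             # first time v is reached
                dist[v] = dist[u] + 1
                count[v] = count[u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:  # another shortest path into v
                count[v] += count[u]
    return {v: (dist[v], count[v]) for v in dist if v != source}


def features(adj, generalized):
    """Histogram over unordered node pairs, keyed by d or by (d, count)."""
    feats = Counter()
    for s in adj:
        for t, (d, c) in shortest_path_stats(adj, s).items():
            if s < t:  # count each unordered pair once
                feats[(d, c) if generalized else d] += 1
    return feats


def kernel(adj_a, adj_b, generalized=False):
    """Dot product of the two feature histograms."""
    fa = features(adj_a, generalized)
    fb = features(adj_b, generalized)
    return sum(fa[key] * fb[key] for key in fa)


# Toy graphs (assumed for illustration): a 4-cycle and a 4-node path.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(kernel(cycle, path))                    # 16
print(kernel(cycle, path, generalized=True))  # 12
```

    On these toy graphs the plain kernel matches the distance-2 pairs of the cycle with those of the path, whereas the generalized variant does not, because each distance-2 pair in the cycle has two shortest paths and each one in the path has only one.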

    A novel string representation and kernel function for the comparison of I/O access patterns

    Parallel I/O access patterns act as fingerprints of a parallel program. To extract meaningful information from these patterns, they have to be represented appropriately. Because string objects can be easily compared using kernel methods, this paper proposes a conversion to a weighted string representation, together with a novel string kernel function called the Kast Spectrum Kernel. The similarity matrices obtained by applying this kernel to a set of examples from a real application were analyzed using Kernel Principal Component Analysis (Kernel PCA) and hierarchical clustering. The evaluation showed that 2 out of 4 I/O access pattern groups were completely identified, while the other 2 formed a single cluster due to the intrinsic similarity of their members. The proposed strategy is a promising candidate for other similarity problems involving tree-like structured data.
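
    The abstract does not define the Kast Spectrum Kernel, so the sketch below shows the standard k-spectrum string kernel instead, as a stand-in for how a string kernel compares two strings by shared substrings. It ignores the weighting described in the paper, and the function names are hypothetical.

```python
from collections import Counter


def spectrum(s, k):
    """Count every contiguous substring of length k in s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def spectrum_kernel(s, t, k):
    """Standard k-spectrum kernel: dot product of the k-mer count vectors."""
    fs, ft = spectrum(s, k), spectrum(t, k)
    return sum(fs[u] * ft[u] for u in fs)


# "abab" has 2-mers ab (x2), ba (x1); "abba" has ab, bb, ba (x1 each).
print(spectrum_kernel("abab", "abba", 2))  # 3
```

    Evaluating such a kernel on every pair of examples gives the kind of similarity matrix that Kernel PCA or hierarchical clustering can take as input.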

    Space-efficient Feature Maps for String Alignment Kernels

    String kernels are attractive tools for analyzing string data. Among them, alignment kernels are known for their high prediction accuracies in string classifications when tested in combination with SVM in various applications. However, alignment kernels have a crucial drawback in that they scale poorly due to their quadratic computation complexity in the number of input strings, which limits large-scale applications in practice. We address this limitation by presenting the first approximation for string alignment kernels, which we call space-efficient feature maps for edit distance with moves (SFMEDM), by leveraging a metric embedding named edit sensitive parsing (ESP) and feature maps (FMs) of random Fourier features (RFFs) for large-scale string analyses. The original FMs for RFFs consume a huge amount of memory proportional to the dimension d of input vectors and the dimension D of output vectors, which prohibits their use in large-scale applications. We present novel space-efficient feature maps (SFMs) of RFFs for a space reduction from O(dD) of the original FMs to O(d) of SFMs with a theoretical guarantee with respect to concentration bounds. We experimentally test SFMEDM on its ability to learn SVM for large-scale string classifications with various massive string data, and we demonstrate the superior performance of SFMEDM with respect to prediction accuracy, scalability and computation efficiency. Comment: Full version for ICDM'19 paper.
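
    To show where the O(dD) memory cost of the original feature maps comes from, here is a minimal sketch of standard random Fourier features. It assumes a Gaussian kernel with bandwidth sigma purely for illustration; it implements neither SFMEDM nor the space-efficient variant, and all names and parameter values are hypothetical.

```python
import math
import random


def make_rff(d, D, sigma, seed=0):
    """Return a feature map z: R^d -> R^D with z(x).z(y) ~ exp(-|x-y|^2 / (2 sigma^2))."""
    rng = random.Random(seed)
    # The D x d projection matrix is the O(dD) memory cost mentioned above.
    W = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(d)] for _ in range(D)]
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(D)]
    scale = math.sqrt(2.0 / D)

    def feature_map(x):
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + bi)
                for w, bi in zip(W, b)]

    return feature_map


def gaussian_kernel(x, y, sigma):
    """Exact Gaussian kernel, for comparison with the approximation."""
    sq = sum((a - c) ** 2 for a, c in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


# Illustrative inputs (assumed values, not from the paper).
x, y, sigma = [0.5, -1.0, 2.0], [1.0, 0.0, 1.5], 1.5
z = make_rff(d=3, D=5000, sigma=sigma)
approx = sum(p * q for p, q in zip(z(x), z(y)))
print(approx, gaussian_kernel(x, y, sigma))
```

    The inner product of the two feature vectors approximates the exact kernel value, with an error that shrinks as D grows; a linear SVM trained on the mapped vectors then stands in for a kernel SVM.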

    Extending local features with contextual information in graph kernels

    Graph kernels are usually defined in terms of simpler kernels over local substructures of the original graphs. Different kernels consider different types of substructures. However, in some cases they have similar predictive performances, probably because the substructures can be interpreted as approximations of the subgraphs they induce. In this paper, we propose to associate with each feature a piece of information about the context in which the feature appears in the graph. A substructure appearing in two different graphs will match only if it appears with the same context in both graphs. We propose a kernel based on this idea that considers trees as substructures, and where the contexts are features too. The kernel is inspired by the framework in [6], although it is not part of it. We give an efficient algorithm for computing the kernel and show promising results on real-world graph classification datasets. Comment: To appear in ICONIP 201

    GRP78 expression in canine mammary tumors: association with malignancy

    78-kDa glucose-regulated protein (GRP78) is over-expressed in human breast carcinomas. GRP78 expression was studied in 40 spontaneous canine mammary tumors and evaluated in relation to tumor histological type, mode of growth, grade, lymph node metastases and distant metastases. All tumors exhibited GRP78 immunostaining. In the normal canine mammary gland, GRP78 was also expressed although not in all cases. In carcinomas GRP78 was detected in the cytoplasm in more than 50% of tumor cells in the vast majority of cases (87.5%). There was a significant association between the absence of squamous differentiation (P = 0.02) and GRP78 over-expression, but no association with other clinico-pathological features. GRP78 was often co-expressed with galectin-3 in canine mammary tumors (CMT).

    Micrometer-sized Water Ice Particles for Planetary Science Experiments: Influence of Surface Structure on Collisional Properties

    Models and observations suggest that ice-particle aggregation at and beyond the snowline dominates the earliest stages of planet formation, which is therefore the subject of many laboratory studies. However, the pressure–temperature gradients in protoplanetary disks mean that the ices are constantly processed, undergoing phase changes between different solid phases and the gas phase. Open questions remain as to whether the properties of the icy particles themselves dictate collision outcomes and therefore how effectively collision experiments reproduce conditions in protoplanetary environments. Previous experiments often yielded apparently contradictory results on collision outcomes, agreeing only on a temperature dependence that sets in above ≈210 K. By exploiting the unique capabilities of the NIMROD neutron scattering instrument, we characterized the bulk and surface structure of icy particles used in collision experiments, and studied how these structures alter as a function of temperature at a constant pressure of around 30 mbar. Our icy grains, formed under liquid nitrogen, undergo changes in the crystalline ice-phase, sublimation, sintering and surface pre-melting as they are heated from 103 to 247 K. An increase in the thickness of the diffuse surface layer from ≈10 to ≈30 Å (≈2.5 to 12 bilayers) proves increased molecular mobility at temperatures above ≈210 K. Because none of the other changes tie in with the temperature trends in collisional outcomes, we conclude that the surface pre-melting phenomenon plays a key role in collision experiments at these temperatures. Consequently, the pressure–temperature environment may have a larger influence on collision outcomes than previously thought.

    Inductive queries for a drug designing robot scientist

    It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it, and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments.

    Surface morphology of AlGaN/GaN heterostructures grown on bulk GaN by MBE

    In this report, the influence of the growth conditions on the surface morphology of AlGaN/GaN heterostructures grown on sapphire-based and bulk GaN substrates is nondestructively investigated, with a focus on the decoration of defects and the surface roughness. Under Ga-rich conditions, specific types of dislocations are unintentionally decorated with shallow hillocks. In contrast, under Ga-lean conditions, deep pits are inherently formed at these defect sites. The structural data show that the dislocation density of the substrate sets the limit for the density of dislocation-mediated surface structures after MBE overgrowth and that no noticeable amount of surface defects is introduced during the MBE procedure. Moreover, the transfer of crystallographic information, e.g. the miscut of the substrate, to the overgrown structure is confirmed. The combination of our MBE overgrowth with the employed surface morphology analysis by atomic force microscopy (AFM) provides a unique possibility for a nondestructive, retrospective analysis of the original substrate defect density prior to device processing.

    Tensile strained $In_{x}Ga_{1-x}P$ membranes for cavity optomechanics

    We investigate the optomechanical properties of tensile-strained ternary InGaP nanomembranes grown on GaAs. This material system combines the benefits of highly strained membranes based on stoichiometric silicon nitride with the unique properties of thin-film semiconductor single crystals, as previously demonstrated with suspended GaAs. Here we employ lattice mismatch in epitaxial growth to impart an intrinsic tensile strain to a monocrystalline thin film (approximately 30 nm thick). These structures exhibit mechanical quality factors of 2*10^6 or beyond at room temperature and 17 K for eigenfrequencies up to 1 MHz, yielding Q*f products of 2*10^12 Hz for a tensile stress of ~170 MPa. Incorporating such membranes in a high-finesse Fabry-Perot cavity, we extract an upper limit to the total optical loss (including both absorption and scatter) of 40 ppm at 1064 nm and room temperature. Further reductions of the In content of this alloy will enable tensile stress levels of 1 GPa, with the potential for a significant increase in the Q*f product, assuming no deterioration in the mechanical loss at this composition and strain level. This material system is a promising candidate for the integration of strained semiconductor membrane structures with low-loss semiconductor mirrors and for realizing stacks of membranes for enhanced optomechanical coupling. Comment: 10 pages, 3 figures

    The functional readthrough extension of malate dehydrogenase reveals a modification of the genetic code

    Translational readthrough gives rise to C-terminally extended proteins, thereby providing the cell with new protein isoforms. These may have different properties from the parental proteins if the extensions contain functional domains. While for most genes amino acid incorporation at the stop codon is far lower than 0.1%, about 4% of malate dehydrogenase (MDH1) is physiologically extended by translational readthrough, and the actual ratio of MDH1x (extended protein) to 'normal' MDH1 depends on the cell type. In human cells, arginine and tryptophan are co-encoded by the MDH1x UGA stop codon. Readthrough is controlled by the 7-nucleotide high-readthrough stop codon context without contribution of the subsequent 50 nucleotides encoding the extension. All vertebrate MDH1x is directed to peroxisomes via a hidden peroxisomal targeting signal (PTS) in the readthrough extension, which is more highly conserved than the extension of lactate dehydrogenase B. The hidden PTS of non-mammalian MDH1x evolved to be more efficient than the PTS of mammalian MDH1x. These results provide insight into the genetic and functional co-evolution of these dually localized dehydrogenases.